Skip to content

Optimizing polymorphic pluck and repeated inflections#650

Merged
lgebhardt merged 3 commits intoJSONAPI-Resources:masterfrom
DavidMikeSimon:optimization
May 2, 2016
Merged

Optimizing polymorphic pluck and repeated inflections#650
lgebhardt merged 3 commits intoJSONAPI-Resources:masterfrom
DavidMikeSimon:optimization

Conversation

@DavidMikeSimon
Copy link
Copy Markdown
Contributor

Two optimizations are implemented in this PR.

  • Don't use AssociationProxy#pluck on polymorphic relationships if the relationship has already been preloaded; pluck always queries the DB.
  • Profiling revealed that various string operations, particularly inflections, were using a lot of time. This PR adds a simple caching mechanism for the most commonly repeated operations.

On my app, this halves the response times on some requests. For example, a common index with lots of includes used to take between 1.505 and 1.546 seconds to complete, but with this PR that is reduced to between 0.700 and 0.795 (not including the first request, to allow for cache to warm up).

@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@lgebhardt This look okay to you?

@cdunn
Copy link
Copy Markdown

cdunn commented Mar 25, 2016

+1 for no n+1

@aaronbhansen
Copy link
Copy Markdown
Contributor

aaronbhansen commented Apr 17, 2016

I noticed a lot of these same things when doing a performance review on our code and opened #674. One thing that would be nice is to have the cache only cache based on the environment settings. So that as your changing your files locally, nothing is cached when you're in development. Possibly move the cache to rails cache so it could be cleared also.

@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@aaronbhansen Agreed on having it be env-controlled.

However, I don't think Rails.cache would be good to use here, though; I suspect the overhead of the lookup would be too high.

Another problem that occurs to me looking back over it: my cache is not thread-safe. However, I experimented with using a thread-safe cache (e.g. Hash in a ThreadLocalVar) but overhead was pretty high. :-( I think I need to look into a different way of caching these values that can be forced to run pre-fork.

@aaronbhansen
Copy link
Copy Markdown
Contributor

Rails cache would be great (for clearing / management), but I agree for such time sensitive operations, the lookup would probably be to big.

What would also be nice, is if the caching classes could inherit from the base classes, so that in the configuration you could even pick which class to use. Like the operations processor today

config.operations_processor = :active_record

# options include formatter, cached_formatter
config.formatter = :cached_formatter

It's just a thought, that leaves the base classes alone and allows a caching layer on top. I don't know if that causes more work than its worth, but would allow easy testing between the two.

@aaronbhansen
Copy link
Copy Markdown
Contributor

@DavidMikeSimon we are starting to need this more in production, I don't want to replicate this tickets efforts, but would love to help to get this out

@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@aaronbhansen I will look into cleaning up my PR tomorrow, and after that it would be very awesome if you could review it

@DavidMikeSimon DavidMikeSimon force-pushed the optimization branch 2 times, most recently from 82b6e08 to 20b25c6 Compare April 22, 2016 21:09
@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@aaronbhansen @lgebhardt Okay, I believe this is now thread-safe. I also made the formatter caching configurable (though I don't think it'll cause any problems even in dev), and added a simple benchmark which you can run with 'rake test:benchmark'.

On my system, here are the results with NaiveCache completely disabled:

                                user     system      total        real
bench_large_index_request .  9.860000   0.020000   9.880000 (  9.887991)

With caching added to LinkBuilder and ResourceSerializer:

                                user     system      total        real
bench_large_index_request .  7.060000   0.000000   7.060000 (  7.074621)

With caching fully enabled:

                                user     system      total        real
bench_large_index_request .  5.440000   0.020000   5.460000 (  5.476966)

@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@dgeb Seem alright to you?

@aaronbhansen
Copy link
Copy Markdown
Contributor

Just did a quick test run locally, it did seem to reduce call time by about 50%. Going to push to our staging server and see how it does there. I've dive into the code more, but initial results look promising. Nice work @DavidMikeSimon 👍

@aaronbhansen
Copy link
Copy Markdown
Contributor

We've been testing this branch out on staging the last 5 days. Two things I think need to be taken into consideration on this pull request.

  1. The signature of the formatters did change. We have custom formatters and didn't realize right away they weren't being called since the methods were moved from class calls to instance calls. Merging this pull request in, will require anyone with custom formatters to make a change.
  2. I wonder if we need a cache memoization method on the base formatter for determining if the value we are caching could change. We aren't doing this today in our code, but I could see the possibility of someone using the value being passed in, with user context or some other global scope to possibly format differently. Having a way to override the cache key would allow for those variations.

@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@aaronbhansen Good catch on point 1, I forgot to mention that.

Regarding point 2, you can disable formatter caching in the config, or set the cached method on your custom formatter to return a smarter cache that's tolerant of global state changes, or to just return self and do no caching at all.

However, a formatter that's not deterministic on its input is probably a bad idea in general, because JSONAPI needs to be able to reversibly call unformat its output at any later time and get back a string version of the original value. If you have a thing in that could have multiple possible names depending on context, it's a best to just handle that in the relevant Resource(s) by providing alias methods or something similar.

@lgebhardt
Copy link
Copy Markdown
Contributor

@DavidMikeSimon Sorry for the late review of this PR. I like the performance gains.

I'm probably missing something completely obvious, but what was the rational for the changes to the formatters regarding moving the class level methods to instance methods? I'm worried it will break a lot of people as @aaronbhansen has proven. Could we simply cache the class (klass) instead of instances? Even if instances are needed (and I'm not sure they are) it seems leaving the existing class level methods would prevent a lot of upgrade issues.

@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@lgebhardt Well, the main reason is that a formatter cache is so small and so frequently accessed that it's better to have a separate cache for each thread than to try and synchronize access to a single cache across threads. I changed the nature of the Formatter classes to represent this idea.

I can easily change it back, though; the cached method will then instantiate a new subclass of FormatterWrapperCache instead of a new instance. I'll make this change now.

@lgebhardt
Copy link
Copy Markdown
Contributor

@DavidMikeSimon I think you could keep the new cache stuff the same and just call the class level methods for format/unformat which would keep the existing formatters the same since they would derive from the modified Formatter base class.

@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@lgebhardt Ok, sounds good.

@DavidMikeSimon DavidMikeSimon force-pushed the optimization branch 2 times, most recently from 1a44fd3 to 6a60e7a Compare April 28, 2016 20:00
@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@lgebhardt Alright, formatters are back to using class methods.

Comment thread lib/jsonapi/formatter.rb Outdated
class << self
def format(key)
super.camelize(:lower)
key.to_s.camelize(:lower)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we still call super here incase someone wants to add an override in the base?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaronbhansen I think so.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okies, I'll put that back as super. I think that change was left over from earlier experimentation on making the format methods themselves faster.


alias_method :polymorphic?, :polymorphic

def primary_key
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did these functions just change in order?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there's one other difference: @resource_klass = become @resource_klass ||= to reduce repeated calls. I'll flip it back around so the diff shows this more clearly.

Comment thread lib/jsonapi/resource.rb Outdated
end

class << self
@@model_types_to_resource_types = {}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is vestigal, I'll remove it.

@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@lgebhardt I adjusted the diff in relationship.rb in response to your line note, and also removed a vestigal line from another file.

@lgebhardt lgebhardt merged commit 174bd2f into JSONAPI-Resources:master May 2, 2016
@lgebhardt
Copy link
Copy Markdown
Contributor

@DavidMikeSimon Thanks!

@DavidMikeSimon DavidMikeSimon deleted the optimization branch May 2, 2016 18:39
@DavidMikeSimon
Copy link
Copy Markdown
Contributor Author

@lgebhardt No problem. :-)

I also have another optimization idea that I'm working on in a different branch: caching entire serialized resources. I'll submit a PR when I have something working.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants